NSF PAR Search | NSF Public Access Repository

Achieving 1/N Optimality Gap in Restless Bandits through Gaussian Approximation

Yan, Chen; Wang, Weina; Ying, Lei (December 2025, NeurIPS)

Full Text Available
Achieving $\tilde{O}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation

Yan, Chen; Wang, Weina; Ying, Lei (December 2025, NeurIPS 2025)

Full Text Available
Achieving Exponential Asymptotic Optimality in Average-Reward Restless Bandits without Global Attractor Assumption

Hong, Yige; Xie, Qiaomin; Chen, Yudong; Wang, Weina (November 2025, NeurIPS 2025 Workshop MLxOR: Mathematical Foundations and Operational Integration of Machine Learning for Uncertainty-Aware Decision-Making)

Full Text Available
Lyapunov-Based Sample Complexity Analysis for Weakly-Coupled MDPs

Wu, Tianhao; Zurek, Matthew; Chen, Yudong; Wang, Weina; Xie, Qiaomin (November 2025, NeurIPS 2025 Workshop MLxOR: Mathematical Foundations and Operational Integration of Machine Learning for Uncertainty-Aware Decision-Making)

Full Text Available
Unichain and Aperiodicity Are Sufficient for Asymptotic Optimality of Average-Reward Restless Bandits

https://doi.org/10.1287/moor.2024.0678

Hong, Yige; Xie, Qiaomin; Chen, Yudong; Wang, Weina (December 2025, Mathematics of Operations Research)

We consider the infinite-horizon, average-reward restless bandit problem in discrete time. We propose a new class of policies designed to drive a progressively larger subset of arms toward the optimal distribution. We show that our policies are asymptotically optimal with an [Formula: see text] optimality gap for an N-armed problem, assuming only that the problem is unichain and aperiodic. Our approach departs from most existing work, which focuses either on index or priority policies that rely on the Global Attractor Property to guarantee convergence to the optimum, or on a recently developed simulation-based policy that requires a Synchronization Assumption.
Full Text Available
Job assignment in machine learning inference systems with accuracy constraints

https://doi.org/10.1016/j.peva.2024.102463

Choudhury, Tuhinangshu; Joshi, Gauri; Wang, Weina (March 2025, Performance Evaluation)

Full Text Available
Efficient Algorithms for Attributed Graph Alignment with Vanishing Edge Correlation

Wang, Ziao; Wang, Weina; Wang, Lele (June 2024, Proceedings of Machine Learning Research, 37th Annual Conference on Learning Theory)

Full Text Available
Attributed Graph Alignment

https://doi.org/10.1109/TIT.2024.3403810

Zhang, Ning; Wang, Ziao; Wang, Weina; Wang, Lele (August 2024, IEEE Transactions on Information Theory)

Full Text Available
Efficient Reinforcement Learning for Routing Jobs in Heterogeneous Queueing Systems

Jali, Neharika; Qu, Guannan; Wang, Weina; Joshi, Gauri (May 2024, Proceedings of The 27th International Conference on Artificial Intelligence and Statistics)

We consider the problem of efficiently routing jobs that arrive into a central queue to a system of heterogeneous servers. Unlike in homogeneous systems, a threshold policy that routes jobs to the slow server(s) when the queue length exceeds a certain threshold is known to be optimal for the one-fast-one-slow two-server system. But an optimal policy for the multi-server system is unknown and non-trivial to find. While Reinforcement Learning (RL) has been recognized to have great potential for learning policies in such cases, our problem has an exponentially large state space, rendering standard RL inefficient. In this work, we propose ACHQ, an efficient policy gradient-based algorithm with a low-dimensional soft threshold policy parameterization that leverages the underlying queueing structure. We provide stationary-point convergence guarantees for the general case and, despite the low-dimensional parameterization, prove that ACHQ converges to an approximate global optimum for the special case of two servers. Simulations demonstrate an improvement in expected response time of up to ∼30% over the greedy policy that routes to the fastest available server.
Full Text Available
On the Feasible Region of Efficient Algorithms for Attributed Graph Alignment

https://doi.org/10.1109/TIT.2024.3351107

Wang, Ziao; Zhang, Ning; Wang, Weina; Wang, Lele (May 2024, IEEE Transactions on Information Theory)

Full Text Available
